Translating with Non-contiguous Phrases
Abstract
This paper presents a phrase-based statistical machine translation method based on non-contiguous phrases, i.e. phrases with gaps. A method for producing such phrases from word-aligned corpora is proposed. A statistical translation model that deals with such phrases is also presented, as well as a training method based on the maximization of translation accuracy, as measured with the NIST evaluation metric. Translations are produced by means of a beam-search decoder. Experimental results are presented that demonstrate how the proposed method generalizes better from the training data.
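To make the notion of a phrase with gaps concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hypothetical representation in which a phrase is a list of words and gap markers, and in which each gap stands for exactly one skipped word; the abstract does not specify how gaps are defined, so the names `GAP`, `match_at`, and `find_matches` are illustrative only.

```python
from typing import List, Optional, Tuple

# Hypothetical gap marker. Assumption: each gap skips exactly one word.
GAP = "_"


def match_at(pattern: List[str], sentence: List[str], start: int) -> Optional[Tuple[int, ...]]:
    """Return the sentence positions covered by the pattern's words
    when the pattern is anchored at `start`, or None if it does not fit."""
    if start + len(pattern) > len(sentence):
        return None
    covered = []
    for offset, token in enumerate(pattern):
        if token == GAP:
            continue  # a gap matches any single word and covers nothing
        if sentence[start + offset] != token:
            return None
        covered.append(start + offset)
    return tuple(covered)


def find_matches(pattern: List[str], sentence: List[str]) -> List[Tuple[int, ...]]:
    """Collect every anchoring of the pattern that matches the sentence."""
    matches = []
    for start in range(len(sentence)):
        covered = match_at(pattern, sentence, start)
        if covered is not None:
            matches.append(covered)
    return matches


sentence = "je ne veux pas danser".split()
pattern = ["ne", GAP, "pas"]
print(find_matches(pattern, sentence))  # [(1, 3)]
```

In this sketch the pattern covers positions 1 and 3 and leaves position 2 free, so a decoder could translate the word in the gap with a different phrase.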
Related resources
A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation
The tree sequence based translation model allows the violation of syntactic boundaries in a rule to capture non-syntactic phrases, where a tree sequence is a contiguous sequence of subtrees. This paper goes further to present a translation model based on non-contiguous tree sequence alignment, where a non-contiguous tree sequence is a sequence of sub-trees and gaps. Compared with the contiguous...
A Generalized Reordering Model for Phrase-Based Statistical Machine Translation
Phrase-based translation models are widely studied in statistical machine translation (SMT). However, the existing phrase-based translation models either cannot deal with non-contiguous phrases or reorder phrases only by the rules without an effective reordering model. In this paper, we propose a generalized reordering model (GREM) for phrase-based statistical machine translation, which is not...
CLUE-Aligner: An Alignment Tool to Annotate Pairs of Paraphrastic and Translation Units
Currently available alignment tools and procedures for marking-up alignments overlook non-contiguous multiword units for being too complex within the bounds of the proposed alignment methodologies. This paper presents the CLUE-Aligner (Cross-Language Unit Elicitation Aligner), a web alignment tool designed for manual annotation of pairs of paraphrastic and translation units, representing both c...
Contrastive Analysis of Aspectual Oppositions in English and Persian
This article aims at contrasting aspectual oppositions in English and Persian in the context of the novel The Old Man and the Sea, and its translation by Daryabandari (1983) as the data. Unlike English, in Persian perfective and imperfective forms are morphologically marked. While the vast majority of English simple past forms are translated into Persian by past perfective forms, only less than...
Werdy: Recognition and Disambiguation of Verbs and Verb Phrases with Syntactic and Semantic Pruning
Word-sense recognition and disambiguation (WERD) is the task of identifying word phrases and their senses in natural language text. Though it is well understood how to disambiguate noun phrases, this task is much less studied for verbs and verbal phrases. We present Werdy, a framework for WERD with particular focus on verbs and verbal phrases. Our framework first identifies multi-word expressio...
Publication date: 2005